Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for ground bv2nat and int2bv reasoning #733

Merged
merged 3 commits into from
Jul 19, 2023

Conversation

bclement-ocp
Copy link
Collaborator

This patch adds support for "combined" reasoning about the int2bv and bv2nat symbols in the arith and bitv reasoners.

More specifically:

  • The constructors for bv2nat and int2bv in expr.ml are now "smart" and know that they are related, so that we effectivly have the following rewrite rules:

    (bv2nat ((_ int2bv n) x)) -> (mod x (pow 2 n))
    ((_ int2bv n) (bv2nat (as x (_ BitVec m))))
    -> (ite (> m n) (extract (- n 1) 0 x) (zero_extend (- m n) x))

    These are added early in the bv2nat and int2bv constructors to ensure that we never actually build ((_ int2bv n) (bv2nat e)) or (bv2nat ((_ int2bv n) e)) terms. This is actualy useful because the simplifications done for bv2nat (see below) are more powerful than what int2bv can recover, and there would be information loss otherwise e.g. for ((_ int2bv n) (bv2nat (concat e1 e2))) which would become

    ((_ int2bv n) (+ (* (bv2nat e1) (pow 2 m)) (bv2nat e2)))

    instead of

    (concat e1 e2)

    (assuming that e2 is of type (_ BitVec m) and e1 of type (_ BitVec |n - m|)).

  • The arith and bitv reasoners perform constant propagation of int2bv and bv2nat

  • The arith reasoner performs the ((_ int2bv n) (bv2nat x)) conversion more aggressivey, i.e. it performs the conversion for terms of the form ((_ int2bv n) e) where the semantic value of e is of the form (bv2nat e')

    More complex transformations, such as transforming (int2bv (* 2 (bv2nat e))) into (concat e #b0), are not supported.

  • The bitv reasoner also performs the (bv2nat ((_ int2bv n) e)) conversion more aggressively, i.e. for terms that reduce to ((_ int2bv n) e) after simplification.

  • The bitv reasoner performs additional simplifications on bv2nat terms after normalization. For instance, (bv2nat (concat x y)) becomes (+ (* (pow 2 n) (bv2nat x)) (bv2nat y)) when y is a bit-vector of size n, and (bv2nat (extract j i x)) becomes (mod (div (bv2nat x) (pow 2 i)) (pow 2 (+ (- j i) 1)))

  • Finally, the bitv reasoner requires that the result of bv2nat for a bitvector of size n is in the range 0 ..< 2^n

Combined, this allows exploiting the arith reasoner for a variety of problems mostly involving bitvector arithmetic with constants. In particular, we are able to handle many aritmetic expressions involving bv2nat.

On the other hand, bitvector expressions involving int2bv have more limited support. Indeed, not only does int2bv perform less simplifications, we are currently not able to exploit the relevant equivalence:

int2bv[n](x) = int2bv[n](y) iff x = y (mod 2^n)

This patch adds support for "combined" reasoning about the int2bv and
bv2nat symbols in the arith and bitv reasoners.

More specifically:

 - The constructors for bv2nat and int2bv in `expr.ml` are now "smart"
   and know that they are related, so that we effectivly have the
   following rewrite rules:

    `(bv2nat ((_ int2bv n) x))` -> `(mod x (pow 2 n))`
    `((_ int2bv n) (bv2nat (as x (_ BitVec m))))`
      -> `(ite (> m n) (extract (- n 1) 0 x) (zero_extend (- m n) x))`

   These are added early in the bv2nat and int2bv constructors to ensure
   that we *never* actually build `((_ int2bv n) (bv2nat e))` or
   `(bv2nat ((_ int2bv n) e))` terms. This is actualy useful because the
   simplifications done for `bv2nat` (see below) are more powerful than
   what `int2bv` can recover, and there would be information loss
   otherwise e.g. for `((_ int2bv n) (bv2nat (concat e1 e2)))` which
   would become

     `((_ int2bv n) (+ (* (bv2nat e1) (pow 2 m)) (bv2nat e2)))`

   instead of

      `(concat e1 e2)`

   (assuming that `e2` is of type `(_ BitVec m)` and `e1` of type
   `(_ BitVec |n - m|)`).

 - The arith and bitv reasoners perform constant propagation of int2bv
   and bv2nat

 - The arith reasoner performs the `((_ int2bv n) (bv2nat x))`
   conversion more aggressivey, i.e. it performs the conversion for
   terms of the form `((_ int2bv n) e)` where the semantic value of `e`
   is of the form `(bv2nat e')`

   More complex transformations, such as transforming
   `(int2bv (* 2 (bv2nat e)))` into `(concat e #b0)`, are not
   supported.

 - The bitv reasoner also performs the `(bv2nat ((_ int2bv n) e))`
   conversion more aggressively, i.e. for terms that reduce to
   `((_ int2bv n) e)` after simplification.

 - The bitv reasoner performs additional simplifications on `bv2nat`
   terms after normalization. For instance, `(bv2nat (concat x y))`
   becomes `(+ (* (pow 2 n) (bv2nat x)) (bv2nat y))` when `y` is a
   bit-vector of size `n`, and `(bv2nat (extract j i x))` becomes
   `(mod (div (bv2nat x) (pow 2 i)) (pow 2 (+ (- j i) 1)))`

 - Finally, the bitv reasoner requires that the result of `bv2nat` for a
   bitvector of size `n` is in the range `0 ..< 2^n`

Combined, this allows exploiting the arith reasoner for a variety of
problems mostly involving bitvector arithmetic with constants. In
particular, we are able to handle many aritmetic expressions involving
`bv2nat`.

On the other hand, bitvector expressions involving `int2bv` have more
limited support. Indeed, not only does `int2bv` perform less
simplifications, we are currently not able to exploit the relevant
equivalence:

  `int2bv[n](x) = int2bv[n](y) iff x = y  (mod 2^n)`
@bclement-ocp
Copy link
Collaborator Author

Note: I am re-running benches on QF_BV. There was an issue with the old canonizer where we could have bv terms as aliens; the new canonizer (from #732 ) no longer has that issue.

But the tests are still very slow b/c there are many timeouts… there are 42k tests, and if they all time out, if would take 4h to run all the tests. In practice it looks like it's more around half of them timing out?

@bclement-ocp
Copy link
Collaborator Author

This is +1272-1843 which I assume is mostly due to #736 but still is a big drop.

Although there is also a relatively high number of segmentation faults (I tracked one to Z.to_string from ZArith on marvin, but couldn't reproduce locally) so… looks like something fishy is going on, maybe with the gmp version on marvin?

@bclement-ocp
Copy link
Collaborator Author

Although there is also a relatively high number of segmentation faults (I tracked one to Z.to_string from ZArith on marvin, but couldn't reproduce locally) so… looks like something fishy is going on, maybe with the gmp version on marvin?

Turns out the Z.to_string issue was spurious and the actual is a stack overflow because `bv2nat2 is not an bitvector but an integer and we were looping. Rerunning the tests.

@bclement-ocp
Copy link
Collaborator Author

This is now +1541-1693 (still on QF_BV) which is more acceptable, but we are still losing a lot. Maybe related to #736? On the other hand, we want to compute on ground problems!

(Note that, now that I think about it, this is probably a big part of the issue --- due to technical limitations we are currenty only able to simplify e.g. (bvadd #b01 #b01) into #b10 but not (bvadd x x) into #b10 in a context where we know x = #b01)

@bclement-ocp
Copy link
Collaborator Author

(Note that, now that I think about it, this is probably a big part of the issue --- due to technical limitations we are currenty only able to simplify e.g. (bvadd #b01 #b01) into #b10 but not (bvadd x x) into #b10 in a context where we know x = #b01)

Which I just figured out how to do, will try to make a PR soon.

@bclement-ocp
Copy link
Collaborator Author

(Note that, now that I think about it, this is probably a big part of the issue --- due to technical limitations we are currenty only able to simplify e.g. (bvadd #b01 #b01) into #b10 but not (bvadd x x) into #b10 in a context where we know x = #b01)

Which I just figured out how to do, will try to make a PR soon.

Well, actually no, because my solution would break solve.

@Halbaroth
Copy link
Collaborator

Can I merge this PR?

@bclement-ocp
Copy link
Collaborator Author

Yes let's merge this after our discussion

@bclement-ocp bclement-ocp merged commit 4bf5b1a into OCamlPro:next Jul 19, 2023
9 of 10 checks passed
bclement-ocp added a commit to bclement-ocp/alt-ergo that referenced this pull request Nov 20, 2023
It turns out that OCamlPro#881 revealed
a latent issue with the implementation in
OCamlPro#733

When we encounter a [bv2nat] term, we create a *term* that represents
the [bv2nat] computation, then call [X.make] on the result. This has the
unfortunate effect that the sub-terms of that new term are not added to
the Union-Find (or at least not before the term itself). This is bad,
but was harmless at the time.

However, the resulting term from a call of [bv2nat] includes divisions.
And since OCamlPro#881 divisions are
represented by an uninterpreted term rather than an existential variable
— and [IntervalCalculus.add] inspects that term, rightfully expecting
that its sub-terms have been added to the Union-Find, which in this case
they are not, and so we crash.

This patch fixes the issue by introducing an uninterpreted term for the
result of [bv2nat] and adding the equality between that uninterpreted
term and the [bv2nat] computation. This equality is then processed as a
term on its own, ensuring that subterms are added to the Union-Find
before the terms that contain them.

Note that long-term, we probably want to get rid of the construction of
[bv2nat] terms directly in [X.make] in favor of propagators in a
follow-up to OCamlPro#944 . This would
avoid needing to go through terms for this.
bclement-ocp added a commit to bclement-ocp/alt-ergo that referenced this pull request Nov 20, 2023
It turns out that OCamlPro#881 revealed
a latent issue with the implementation in
OCamlPro#733

When we encounter a [bv2nat] term, we create a *term* that represents
the [bv2nat] computation, then call [X.make] on the result. This has the
unfortunate effect that the sub-terms of that new term are not added to
the Union-Find (or at least not before the term itself). This is bad,
but was harmless at the time.

However, the resulting term from a call of [bv2nat] includes divisions.
And since OCamlPro#881 divisions are
represented by an uninterpreted term rather than an existential variable
— and [IntervalCalculus.add] inspects that term, rightfully expecting
that its sub-terms have been added to the Union-Find, which in this case
they are not, and so we crash.

This patch fixes the issue by introducing an uninterpreted term for the
result of [bv2nat] and adding the equality between that uninterpreted
term and the [bv2nat] computation. This equality is then processed as a
term on its own, ensuring that subterms are added to the Union-Find
before the terms that contain them.

Note that long-term, we probably want to get rid of the construction of
[bv2nat] terms directly in [X.make] in favor of propagators in a
follow-up to OCamlPro#944 . This would
avoid needing to go through terms for this.
bclement-ocp added a commit that referenced this pull request Nov 22, 2023
It turns out that #881 revealed
a latent issue with the implementation in
#733

When we encounter a [bv2nat] term, we create a *term* that represents
the [bv2nat] computation, then call [X.make] on the result. This has the
unfortunate effect that the sub-terms of that new term are not added to
the Union-Find (or at least not before the term itself). This is bad,
but was harmless at the time.

However, the resulting term from a call of [bv2nat] includes divisions.
And since #881 divisions are
represented by an uninterpreted term rather than an existential variable
— and [IntervalCalculus.add] inspects that term, rightfully expecting
that its sub-terms have been added to the Union-Find, which in this case
they are not, and so we crash.

This patch fixes the issue by introducing an uninterpreted term for the
result of [bv2nat] and adding the equality between that uninterpreted
term and the [bv2nat] computation. This equality is then processed as a
term on its own, ensuring that subterms are added to the Union-Find
before the terms that contain them.

Note that long-term, we probably want to get rid of the construction of
[bv2nat] terms directly in [X.make] in favor of propagators in a
follow-up to #944 . This would
avoid needing to go through terms for this.
@bclement-ocp bclement-ocp deleted the bclement/bvint branch January 3, 2024 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants